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Abstract. A procedure is described for estimating an optimum kernel for the detection by convolution of signals among 
Poissonian noise. The technique is applied to the detection of x-ray point sources in XMM-Newton data, and is shown to yield 
an improvement in detection sensitivity of up to 60% over the sliding-box method used in the creation of the IXMM catalog. 
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1. Introduction 

Over the past few years, CCD cameras on satellite observato- 
ries such as ROS AT, ASCA, Chandra and most recendy XMM- 
Newton have generated high-resolution digital images of the 
x-ray sky which are characterized by relatively faint back- 
ground (for example, background fluxes of less than 1 count 
per CCD pixel are often seen in typical-duration XMM-Newton 
exposures). The number of events per pixel follows a Poisson 
probability distribution which, at such low flux levels, deviates 
markedly from its Gaussian bright-end limit. 

It seems likely that a significant part of the (non- 
instrumental) x-ray background is comprised of point sources 
too faint to be distinguished from one another by present tech- 
niques (Mushotzky et al l2000l Hasinger et al l2001> . The desire 
to characterise this population of sources is one reason for at- 
tempting to push the sensitivity of x-ray point source detection 
to the lowest limits allowed by the data. 

Several source-detection procedures have been applied 
to x-ray images, eg: sliding-box (DePonte & Primini 119931 



Dobrzycki et al l20001 see also task documentation for the exsas 
task detect and the SAS task eboxdetect); wavelet (Damiani 
et al [T997l Starck & Pierre [T9981 Pierre et al l2004l see also 
task documentation for the CIAO task wavdetect and the SAS 
task ewavelet); maximum-likelihood PSF fitting (Cruddace et 
al ll987l Boese and Doebereiner 1200 1 l l : and Voronoi tessella- 
tion (Ebeling & Wiedenmann I 1 9931 . There are also occasional 
references to use of a 'matched filter' technique, but in the x- 
ray sphere at least this appears to consist just of convolution 
by the Point Spread Function or PSF (Vikhlinin et al 1 19951 
Alexander et al 20Q3 > . A similar technique has also been ap- 
plied to ASCA data (Ueda et al ll999> . As is shown in the 
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present paper, the PSF approaches a true matched filter in the 
limit of white Gaussian noise but is sub-optimal at low levels 
of background. 

A useful review of source-detection procedures as applied 
to x-ray images can be found in Valtchanov et al (2001 ). 

Many of these techniques include a step in which the raw 
image is subjected to a convolution, often as a first step in 
preparing a detection-likelihood map. Even PSF fitting, in sim- 
plified form, can be shown to be equivalent to a convolution 
(Stetson ll987ll l. Voronoi tessellation seems to be the only tech- 
nique which cannot readily be understood in this form. 

Extraction of source positions and detection likelihoods 
from the convolved image is in general not simple, because 
the characteristic width of the PSF of the mirror system is usu- 
ally larger than the CCD pixel size. In a (raw) image where 
the PSF is resolved, neighbouring pixels are not statistically in- 
dependent - detection of some source flux in one pixel makes 
it more likely to detect it in neighbouring pixels. However, if 
the null hypothesis (ie, that there are no sources in the field) 
is assumed, detected counts are just due to background. But 
the background (by definition) must be slowly varying over the 
scale of the PSF, otherwise it would be impossible to separate 
it from the sources; and in the limit of smooth background the 
event counts in the raw image are statistically independent. 

On the other hand, as has already been pointed out, a signif- 
icant fraction of the cosmological x-ray background (perhaps 
approaching 100% at energies above 1 keV) actually consists 
of sources. This seems in direct contradiction to the null hy- 
pothesis. It is shown however in appendix 1X1 that the bulk of 
these background sources must be very faint, and that for x- 
ray telescopes of presently achievable effective areas, one can 
model the net contribution of this source population by a rela- 
tively sparse population of brighter sources superimposed upon 
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a smooth background. I conclude therefore that it is acceptable 
to test the null hypothesis, thus in effect to detect sources, on a 
pixel-by-pixel basis. 

Some source-detection chains which assume the null hy- 
pothesis have adopted the following simplified scheme: 

1 . Determination of the expectation value of background at all 
pixels; 

2. Convolution of the raw image; 

3. Calculation of a null-hypothesis likelihood map, which is 
just the likelihood, given the assumed background, of each 
value of the convolved image arising from it by chance; 

4. Location of sources by centroiding of troughs in the likeli- 
hood map; deaUng with confused sources; parameter mea- 
surement, etc. 

The purpose of the present paper is to describe methods of 
calculating and applying a 'matched filter', that is a convolver 
which is optimized for the detection, on a pixel-by-pixel basis, 
of sources of a given PSF against a given background. Only 
steps 2 and 3 of the above sequence are considered - ie, it is as- 
sumed that a good estimate of the background is available, and 
that source centroiding, confusion resolution etc in the likeli- 
hood map may be independently optimized. 

The matched-filter detection scheme is compared with the 
sliding-box technique, specifically that used in the construction 
of the IXMM catalog (Watson et al 2003>l . The sliding box 
technique is arguably the simplest and most widely used of the 
detection techniques and thus provides a convenient baseline 
for comparison. 

2. Linear signal detection 

2.1. General 

Suppose we have a parent function C which is a function of 
some independent variables x (eg time, position, energy) and 
which comprises a background B and signal S such that 

C{x)^ B{x) + aS{x-XQ) (1) 

where 




dx S(x) - I, 



a is the signal amplitude and xq is a reference point which 
serves to locate the signal in x space. Let each experimental 
measurement of C return a random variable c which is dis- 
tributed according to a probability distribution with an expec- 
tation value (c) = C. A common problem in signal detection 
occurs when one one knows (or can estimate) B{x) and S {x) a 
priori and one wishes to calculate the most likely values of a 
and jfo from a set of samples c, of C at different values of x. ' 
In the case of point source detection, jc is a two-coordinate vec- 
tor which specifies position in the focal plane of a camera, S is 

' For simplicity I'll assume in this section that there is only one sig- 
nal S to be found, although, since convolution filtering is a linear pro- 
cess, in principle there is no difficulty in detecting many superposed 
signals provided they have sufficiently different values of jcq. 



the point spread function (PSF) of the camera, and the samples 
Cij are measurements of flux made on a pixel grid in the focal 
plane. 

As outlined in the introduction, to assess whether there is 
any signal present in a given channel / one calculates the prob- 
ability of the parent background B, alone generating either the 
observed count c, or any value higher than this. This is called 
the probability of the null hypothesis (/"null)- A signal is judged 
to be 'detected' in channel / if the null probability of the mea- 
sured c, falls below a previously selected cutoff' value. 

It is very often the case in practice that S extends over sev- 
eral channels. In this case it is usually possible to improve the 
signal-to-noise ratio in at least one of the channels spanned 
by the signal, hence the detection sensitivity, by performing 
a weighted sum of the counts measured over several adjacent 
channels. In the XMM-Newton case which we are going to 
consider, x extends along spatial dimensions x and y and energy 
dimension E. The weighted sum is to be computed for each 
spatial pixel (/, j), the eventual aim being to produce a map of 
the null-hypothesis likelihood at each (;, j). For computational 
purposes the spatial sum is most conveniently expressed as a 
convolution; the actual expression employed to calculate the 
weighted sum for each pixel is therefore 

M M Afbands 

"222 '^P^1^kCi-p,j-q,k- (2) 

For present purposes however it is unnecessary to retain such 
complication. As far as the statistical analysis goes, equation 
13 at any given spatial pixel is simply a weighted sum of 
random variates: 

N 

c' = ^ WiCi. (3) 
1=1 

It is also convenient for the present to assume that the signal is 
bracketted between / = 1 and / - N, or, in other words, that the 
sampled parent function C, is given by 

Ci - Bi + aS i. 

This is equivalent to assuming that a source is centred on pixel 
(/, j) of equation|2 

The standard deviation cr' of c' as given in equation |3] is 
estimated from the usual eiTor propagation relations to be 

N 

(4) 

i=I 

where cr, is the standard deviation of c,. 

Clearly, whatever weight scheme is adopted, the stronger 
a signal is (ie, the larger the signal amplitude a), the higher 
the probability that it will be detected (the smaller the value of 
Pnuii)- Ideally we would like to be able to calculate some cutoff 
value of a, above which the signal definitely would be detected, 
and below which it definitely would not. It is however not pos- 
sible to do this, for the reason that the any non-trivial function 
c' of the detected counts must itself be a random variable. A 
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given value of a will not always give rise to the same value 
of c', and any value of a will produce any chosen value of c' 
within a sufficiently large ensemble. To get around the problem 
and so to permit the comparison of different choices of weights 
it is convenient to define a 'counts amplitude' /3 such that 



c'-B' 



(5) 



where the significance of the primes here is that, for any func- 
tion /, 



It is not hard to show that (fi) - a. 

Since any given null-hypothesis probability is associated 
with a definite value of the weighted sum of counts c' , it is 
also therefore unambiguously associated with a definite value 
of p. This allows us to define the detection sensitivity ySjet of 
any convolution as that value of /? which is associated with the 
chosen detection-cutoff value of Pnuii- 

Clearly it is also the case that, for a given B, and 5,, there 
will be a set of weights (not necessarily unique) which gives 
maximum sensitivity, ie the smallest possible value of p^^i. 

2.2. The Gaussian case 

For purposes of comparison I'll briefly reprise the well-studied 
case where the probability distribution of c, for each / is 
Gaussian, with constant cr, = cr. In this case any weighted sum 
returns c' values which also have a Gaussian distribution, with 



One can calculate the signal-to-noise ratio snr as follows: 



c'-B' 



2.3. Weighted- Poisson case with a single weight 

In the case where the probability distribution of the observed 
values Cj is Poissonian, the optimum weights are not so easy to 
come by, and in general will depend in a non-trivial fashion on 
the level of background Z?, . 

Consider first the simplest case, in which there is only one 
weight (A^ = 1 in equation|3}- The solution in this case is trivial, 
but provides a useful mathematical template for the more dif- 
ficult case in which N > 1. Let c be a random (necessarily in- 
teger) variable which has a Poisson probability distribution. A 
second variable c' which is just c multiplied by a single weight 
w retains the Poisson probability distribution 



c! 

but with V now given by 
<c'> 



(6) 



In the null hypothesis, (c) - B and thus (c') = wB = B'. 
If we extrapolate from the unsealed Poissonian case it is also 
clear that the null-hypothesis probability Pnuii is given by 



(c' (c'}\ Ic' B' 

\w w I \w w 



(7) 



where Q is the (complementary) incomplete gamma function, 
defined by 



dte-'f-\ 



snr — 



Q(a, x) = ^— r 
F here represents the gamma function. 

2.4. Weighted-Poisson case with N > I weights 

Formally speaking, we are now no longer in the Poissonian 
regime, since a weighted sum of two or more Poissonian vari- 
ates does not itself in general have a Poissonian probability dis- 
tribution. I have not been able to find a closed-form expression 
for the probability density function in this case. However, two 
empirical approximations are presented in the present subsec- 
tion. 



The null-hypothesis probability is then given by 
Pnuii(snr) = 0.5 (l - erf [snr/ V2]) . 

By differentiating snr with respect to the weights w, and equat- 
ing each of the resulting derivatives to zero one arrives at the 
well-known result that the optimum set of weights is propor- 
tional to the signal itself: ie 

Wi - kS jV i, 

where k is some non-zero constant. 



2.4.1 . The Fay and Feuer approximation 

Fay and Feuer ( 19971 suggested on heuristic grounds that the 
null-hypothesis probability in this general case might be given 
to a good approximation by equation with w replaced by 
an appropriate equivalent weight Wequiv Their prescription for 

^equiv? 



^equiv " 

where 



i=l 
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and, from equation^] 



is obtained essentially by equating respectively the first and 
second moments of the single-weight and many- weight proba- 
bility functions. 

In fact when one compares the integrated f nuii(c') function 
derived from the Fay and Feuer approximation against Monte 
Carlo data (see figureQ, one sees that the Fay and Feuer curve 
seems to be displaced too far to high c' . Essentially this is be- 
cause equation0defines a continuous envelope function to the 
actual discontinuous, stepped Pnun of a single Poisson variate. 
Consider now a weighted sum of Poisson variates, but with 
all the weights having the single value w. here the sum itself 
remains a Poisson variate, in which case the true integrated 
probability distribution remains coarsely stepped as in figure 
[l] the envelope to this being exactly given by equation with 
Wequiv = The coarscncss is because there are no combina- 
tions of c, which can produce any value of c' other than c' = yw 
for integer y; the steps in the integrated probability distribution 
PnuU are thus of width w. If we now allow the weights to be 
randomly perturbed by small amounts, the effect is to smear 
out the steps: ie, a range of values of c' close to iw now become 
possible. Where the weights vv, are allowed to become entirely 
random and independent (and are sufhcently numerous), the 
coarse steps disappear entirely^, and the probability curve ap- 
pears to steer a middle course (eg the dotted line in figure ^ 
through the coarse steps of the single-weighted Poisson distri- 
bution of the same equivalent weight. It seems clear then that 
the approximation formula suggested by Fay and Feuer could 
be made more closely applicable to the general case by shifting 
it towards lower c' values by an amount 0.5wequiv The resulting 
formula is 



^„uii(c') = 1 - e 



1 B' 



^equiv 2 Wequiv 



(8) 



Figure Q] shows an example of a P^^w distribution derived 
from a Monte Carlo exercise (dotted line). The Monte Carlo 
ensemble consisted of 10* values of the weighted sum c' . 25 
random weights vv, were chosen before the start of the exercise: 
these initially had a uniform probability distribution between 
and 1 but were then normalized so that they summed to 1 . (The 
only effect of this normalization is to change the x-axis scale.) 
For each member of the ensemble, a Poisson-random integer c, 
was generated for each of the 25 bins, using a constant back- 
ground value B - 0.3 counts per bin as the Poisson expecta- 
tion value. The weighted sum c' of these 25 random values was 
made and added to the ensemble. The cumulative probability 
was then formed by summing, from high towards low values 
of c', a 100-bin, normalized histogram of the ensemble values. 



^ Note that the true probability curve in the case that N > I still 
falls in stepwise fashion, since the possible values of c' in this case 
are still discrete; it is just that the steps are much more finely spaced, 
since the spacing between possible c' values (and thus also between 
steps) decreases on average at a rate proportional to (c')'"". 
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Fig. 1. A comparison between various weighted-Poisson inte- 
grated probability functions. The dotted line shows the results 
of a Monte Carlo experiment in which c' was formed as a 
weighted sum of 25 independant Poisson variates with an ex- 
pectation value B of 0.3. The value of the equivalent single 
weight Wequiv = cr'^/B' was 5.97 x 10"^. The remaining two 
curves refer to a single Poisson variable c' of the same expec- 
tation value 0.3, weighted by Wequiv The solid stepped line rep- 
resents the true probability Pnun(c') of obtaining a measured 
value equal to or greater than c' in the single-weight case. The 
dashed line is the envelope function to this Pnuiiic'), as given 
by equation^ 



The modified Fay and Feuer curve (equation |8} has not been 
plotted, but its path can be easily visualised by mentally shift- 
ing the dashed curve to the left by half the width of the coarse 
steps. 

Empirical tests with randomly-chosen weights and back- 
ground suggest that equation |8] is a reasonable fit to actual 
distributions of c' for values of null probability greater than 
about 10"^. Better fits were observed when the distribution of 
both weights and background was even, and B' was greater 
than about 0.1. At values of Pnuii smaller than about 10"^, test 
data appear to diverge from equation |8] such that the actual 
null probability at a given value of c' is larger than the pre- 
dicted value. The divergence appears to worsen at low values 
of background, or if the weights are not very homogeneously 
distributed. Comparison of the respective expressions for the 
third moments fij - {{c')^) of the single-weight and multiple- 
weight distributions shows that yU3,muiti is always greater than 
yus.singie for positive, nonequal w, . It is therefore almost certain 
that all the higher moments differ as well. If this is the case 
a divergence at high c' between the many- and single-weight 
distributions is to be expected. 

Attempts to fit a function of the form of equation |8l to test 
data were not satisfactory - that is, the best fit was still clearly 
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Weighted counts c' 

Fig. 2. An example plot comparing approximate expressions 
for the null probability f null for a weighted sum c' of Poisson 
variates at lower values of f null than were explored in figure^ 
a) The dashed line plots the result of a Monte Carlo experiment 
in which the weights were optimized for detecting sources in 
XMM-Newton EPIC images, as described in section l3'.2.1l The 
solid black line gives the modified Fay and Feuer approxima- 
tion of equation|8j the solid grey line, the approximation of 
equation|9] b) This figure, with the same horizontal scale as a, 
shows ratios between probability densities p. The solid black 
and grey lines show respectively the Fay and Feuer and P 
divided by the Monte Carlo p. 

not a good approximation to the parent distribution at low f null- 
Since source detection is necessarily concerned with low values 
of the probability of the null hypothesis, it is desirable to find a 
better approximation in this region. 

2.4.2. Jhex^ approximation 

The -like integrated probabiUty distribution 

^null(c') = e( — ) (9) 
\ ^equiv ^equiv / 

proved to be an acceptable approximation down to at least 
^nuU = 10"^ over a wide range of background values. An ex- 
ample is shown in figure |2 

The Monte Carlo simulation used to generate the results of 
figure|2]was similar to that of figure[2 It consisted of an ensem- 
ble of 10^ weighted sums. The weights in this case constituted 
a matched filter for detection of sources in images with an aver- 
age background of 0.1 counts/pixel, the source shape being the 



on-axis point spread function at 1 .25 keV of the XMM-Newton 
EPIC PN camera. 

It might be possible to improve the fit of equation |9l by 
choosing a different value of Wequiv, but this possibility has not 
been explored. 

It is not at present known why the x^ cumulative distribu- 
tion given in equation|9]seems to provide such a good model of 
the weighted-sum fnuii- 

2.4.3. Consequences of divergence of an 
approximation 

Suppose one inverts a formula such as equation [S] in an at- 
tempt to deduce the detection sensitivity at a given value of 
^'nuii- What in particular goes wrong if the formula is not a 
good approximation to the true probability distribution? The 
answer is that the returned sensitivity value is coiTect, but the 
assumed f null, which controls the number of false positives ex- 
pected, will be incorrect. In the example of figure [5^, cutting 
the source list at an apparent null likelihood of 8 will yield a 
sensitivity of about 0.31 weighted counts if the Fay and Feuer 
expression is used. The true null likelihood at that value of c' 
is that of the Monte Carlo, ie about 6.7. This means that about 
3.7 (= e^^^-^) times more false positives will be encountered 
than expected. The;i'^ formula on the other hand diverges from 
the Monte Carlo data in the other direction. Use of this formula 
gives a conservative result, being apparently slightly less sen- 
sitive at c' - 0.36, with the true null likelihood cutoff of about 
8.6 yielding approximately 0.55 fewer false positives than ex- 
pected. In order to get something like the desired rate of false 
detections, one must skew the apparent null likelihood cutoff, 
to about 10 in the Fay and Feuer case and 7.5 in the x^ case. 
The sensitivity in both cases will then be 'correct' at about 0.34 
weighted counts. 

2.5. Optimization of the weights 

As described in section IZTI for a given background B and sig- 
nal template S , any given value of the weighted sum of counts 
c' is associated with a unique value of the null-hypothesis prob- 
ability f null- In the preceding subsection two equations (jSjand 
|5Jl were given which approximate this relation for the situa- 
tion in which the observed data are random Poisson variates. In 
order to calculate optimum weights in this case we must first 
invert an equation of this form to obtain c', then invert equation 
|5]to obtain the counts amplitude /3. 

Inversion of either equation|S]or|5]to obtain c' involves an 
inversion of the incomplete gamma function Q. For the sake of 
practicality let us define the two inverse functions 2^' and Q^^ 
as follows: if 

P = 1 - Q(a,x), 

then let 

a = Q]\l-P,x). 
and 
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No closed-form expressions for either Q^^ or seem to be 
known; for present purposes I have performed the inversions 
numerically by means of a Ridders-method routine (Ridders 
\W7^ I as given in Press et al (.2003j . 

After insertion of the appropriate inverse gamma function, 
the counts amplitude /3 is thus given for the modified Fay and 
Feuer approximation by 



equiv 



and for the x approximation by 



-B' 



(10) 



equiv 



G2'(PnuU,;^) 



-B' 



(11) 



The inputs to these formulae are (i) the set of weights, (ii) 
the value of f null, and (iii) the background and signal-shape in- 
formation. Only the first two of these are under our control. 
Before we may begin to seek for the optimal set of weights, we 
must choose a value of PnuU- The value we choose should be 
governed by the maximum fraction of false detections we are 
prepared to tolerate. The desired cutoff value Pdet of null prob- 
ability can be obtained from the maximum acceptable number 
"false of false detections by dividing by the number of 'beams' 
in a typical image, which is just the image solid angle divided 
by the 'beam' solid angle. In the present case, in which images 
are convolved before the null hypothesis is tested, the beam 
solid angle must be some kind of equivalent solid angle of 
the appropriately convolved PSF. For XMM-Newton at least, 
where the shape of the PSF varies across the field of view, its 
average value is probably best estimated via Monte Carlo trials. 
If however, for the sake of obtaining at least a rough approx- 
imation to the ratio between f det and Wfaise, we use the value 
2.34 X 10"^ deg^ calculated in appendix IaI for the equivalent 
solid angle of the on-axis EPIC PN PSF as a lower limit to the 
beam Q, we find that a null -probability cutoff of exp(-8.0), the 
value used in the making of the IXMM catalog, coiTesponds to 
at most 2 expected false detections per image. 

Let us define jSjet as the counts amplitude /3 which, for a 
given set of weights, coiTesponds to the chosen value of Pdet- 
j6(jet can be viewed as the amplitude of a source which is just 
detectable under these conditions. The optimum weights are 
then clearly those which yield the smallest value of /3dsf 

In the present study, Powell's direction-set method as mod- 
ified by Press et al (Press et al 2003, chapter 10.5) was used 
to optimize sets of weights by minimizing ^Sdet, as defined by 
either equation[TO|or^2 as a function of the weights. 

3. Detection of x-ray point sources in XI\1l\1-Newton 
data 

The EPIC x-ray cameras of XMM-Newton are described in 
Striider et al L2001 ) and Turner et al (2001 1. There are three 
cameras: the telescope in each case is similar but two of the 
CCD detectors comprise 7 chips of MOS type, whereas the 
third has 12 chips of pn composition. The 'good' area of each 



of the three cameras occupies about 94% (MOS) and 82% (pn) 
of a 30' diameter field of view. CCD pixel dimensions are 1.1" 
square (MOS) and 4.1" square (pn). 

3.1. The source-detection strategy used for the 1XMM 
catalog 

For each exposure, images in sky coordinates were made in 
five separate energy bands. The images had square pixels of 
4x4 arcsec dimension. Images were made by transforming the 
position on the detector of each selected x-ray event into sky 
coordinates, then binning up the events into the image pixels. 
The position of each event was dithered within the boundaries 
of the CCD pixel in which it was detected. Variations over time 
of the spacecraft attitude were also taken into account. 

Source detection was performed on the five images in par- 
allel. The source detection comprised a convolution and detec- 
tion stage (steps 1 to 3 of the sequence described in the in- 
troduction), followed by a source-parameterisation stage (step 
4 in the sequence). The detection etc stage was performed by 
the XMM-Newton SAS task eboxdetect, the parameterisation 
by emldetect. Both stages involve the calculation of a detection 
likelihood; since the value calculated by emldetect is arguably 
more sensitive than that of eboxdetect, the ideal procedure is 
to run eboxdetect with a deliberately low detection threshold, 
submit the resulting long list of source candidates to emldetect 
and to then accept as genuine sources only those for which the 
emldetect detection likelihood exceeded a second, more rea- 
sonable threshold. The eboxdetect threshhold should not be so 
close to the emldetect threshhold that the two selections inter- 
fere. In IXMM practice there is some doubt as to whether the 
two detection threshholds were sufficiently far apart. For this 
reason, and because I don't understand enough of the emlde- 
tect likelihood calculation to be able to replicate it, I have in 
the present paper only considered the sliding-box stage of the 
IXMM detection procedure. 

The first step of this procedure was to make maps, 1 per en- 
ergy band, of the estimated background in each pixel. The five 
images and five background maps were then each convolved 
with a square, 5x5 array of unit values.-' In mathematical form 
this processing can be represented as follows: 



p=-2 q=-2 



2 2 
p~—2 q-—2 

where / and j indicate the position on the image pixel grid and 
k refers to the energy band. 



Two further rebinning and convolution steps, approximately 
equivalent to convolution respectively by 10 x 10 and 20 x 20 arrays, 
were also performed. However, the principal purpose of these extra 
steps was to detect extended sources: therefore they can safely be ne- 
glected for purposes of the present discussion. 
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Sum of likelihoods L 

Fig. 3. Example probability curves relating to the IXMM 
source-detection technique. The dotted line is the result of a 
Monte Carlo experiment (described in the text) which gener- 
ated an ensemble of values of the summed likelihood L de- 
scribed by equation^! The solid line gives the null-hypothesis 
probability predicted by equation The dash-dot line is the 
function 2(5.81/2, L), which was fitted to the Monte Carlo data 
by the procedure described in the text. Note that, for easier 
comparison, the vertical scale and the dynamic range of the 
horizontal scale are the same as those in figure|2 

The overall null-hypothesis probability was calculated as 
follows. Firstly, because all the weights are equal to 1, c' re- 
mains a Poissonian variate; equation0is therefore exact at per- 
mitted values of c', with 

W - Wequiv - 1 ■ 

The probabilities Pij^t returned by equation were converted 
to likelihoods (ie, negative logs of the probabilities). For each 
image pixel, a summed likelihood L, ^ was then generated: 

M 

Ly = -2ln(Py;,), (12) 

k=\ 

where M is the number of energy bands (here 5). The prob- 
ability distribution of L is however open to some question. 
Bevington and Robinson ( 1992 1 show that a sum of the same 
form as equation [T2l is distributed approximately as;t'^/2 (up 
to an additive constant) with v - M degrees of freedom. Wilks 
J1963> as cited in Cash J1979> comes to similar conclusions. 
Perhaps for this reason the authors of the XMM-Newton SAS 
task eboxdetect, which performed the calculation of detection 
likelihoods for the IXMM catalog, used the following approx- 
imate formula for the integrated probability of the null hypoth- 
esis across 5 bands: 

nuii,u = e(5,L,v). (13) 

Note however that the first term represents 1/2 the degrees 
of freedom, hence should be 2.5 not 5; also, the P in equation 
I12lare integrated probabihties, not probability densities. 



Although a full analysis of the IXMM source-detection 
technique is beyond the scope of the present paper, a Monte 
Carlo simulation of the statistical fluctuation of background 
was performed in order to check the accuracy of equation [O] 
An ensemble of 5 x 10^ values of L was accumulated. The 
procedure for each member of the ensemble was as follows. 
For each of the 5 bands, a 5 x 5 array of Poisson-random in- 
tegers was generated. To calculate the expectation value Bi jk 
for pixel (/, j) in the A:th band, the k\h normalized background 
weight for the XMM PN camera, as listed in tabled was multi- 
plied by 0.1 counts/pixel, was calculated by summing the 25 
counts values of the fcth band, Z?^ being of course simply equal 
to 25 X Likelihoods were then generated for each band by 
use of equation^ and summed to give L. 

The accumulated histogram of the resulting data, shown in 
figure |3] shows that there is a significant difference between 
the Monte Carlo results and the prediction of equation 
Attempts were made to fit a variety of functions to the Monte 
Carlo data. One must of course fit to the raw histogram data, 
the adjacent channels of which are statistically independent, 
rather than to the integrated curve displayed in figure|3l It is not 
easy to fit a smooth function to the Monte Carlo data. Partly 
this is because the raw data are rather 'noisy', particularly at 
low values of L, a phenomenon which arises because in this 
range the input counts values to each of the five channels are 
usually small integers which, when combined, give rise to an 
sparse distribution in the allowed resulting L values. The ef- 
fect of this can be seen in the jumpy nature of the accumulated 
Monte Carlo data shown in figure |3 It is also not straighfor- 
ward to define a statistic to be minimized in order to generate 
the fit. A straightfoward sum of squared residuals, or even a chi 
squared sum (ie, sum of squared residuals, each divided by the 
variance in that channel), tends to result in the low-L part of the 
data dominating the fit. This is undesirable if one is interested 
in minimizing false detections due to statistical fluctuations in 
background, in which case intermediate values of L are more 
important. In the end 1 chose arbitrarily to minimize a sum of 
terms 

where / is the function to be fitted, y represents the Monte 
Carlo data and cr'* is the square of the variance. The fit was 
however restricted to values of L less than 15, to avoid statisti- 
cal noise in the Monte Carlo values at high L. 

There is an infinity of functions one could choose to fit to 
the data, but in fact a good fit as shown was obtained simply by 
allowing the first term v/2 in the Q function in equation to 
vary. The best fit, shown in figure|3] occurred at v = 5.81. This 
however suggests no obvious systematic coiTection to equation 
[2]and, since 1 have at this time no better analysis of the prob- 
lem to put forward, and since the unmodified equation^] was 
in fact used in the IXMM source detection, 1 have retained it 
for comparison with the matched-filter method. It seems clear 
however that the rate of statistical false detections in IXMM 
was probably lower than originally estimated, and thus that a 
lower detection cutoff could have been used in that survey with- 
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out penalty. There appears to be no better way to correct esti- 
mates of the sensitivity of this method at the moment than by 
caHbrating it via several Monte Carlos performed at different 
values of background (see eg section|^^3). 

A final point to note about the IXMM detection procedure: 
it did not make use of the fact that, for any given exposure, im- 
ages from more than one of the XMM-Newton x-ray cameras 
were usually available. All source detection was performed in- 
stead on a camera-by-camera basis. Clearly one would expect 
a technique which made use of the parallel information to yield 
an improvement in detection sensitivity. 

3.2. Matched filters for XMM-Newton source detection 

There are three obvious ways to improve on the IXMM source- 
detection procedure by use of matched filters (weighted sums). 
Firstly, taking one energy band at a time, since we know to a 
reasonable approximation the point spread function (PSF) of 
the XMM-Newton telescopes, we could choose in each energy 
band a set of weights optimized for detecting that shape of sig- 
nal; secondly, we might do the same thing across the energy 
bands by the expedient of assuming a common spectrum for 
the sources; thirdly, we might in similar fashion add together 
images from the three x-ray cameras of XMM-Newton. Only 
the first two alternatives are explored in the present study. 

3.2.1 . Examples of matched filters for 1 energy band 

We take a square aiTay of dimension 2N + 1 . Let us assume a 
parent function Cjj as follows: 

dj = Bij + aS ij, V /, j in [-N, N] 

with the normaUzation condition 

N N 
1=-^ j=-N 

For simplicity, the background B is assumed to be constant 
across the array. Let S ij be the PSF on the optic axis of the FN 
telescope of XMM-Newton, at an energy of 1 .25 ke V (the mean 
energy of band 2 as defined in the IXMM catalog (Watson 
et al'2003 1). The PSF model used in the present exercise was 
originally calculated via a ray-tracing approach (Gondoin et al 
I1998> . and is the same that was used to determine positions and 
fluxes of the sources in IXMM. In the present exercise the PSF 
was centred on the middle of the (/, j) - (0, 0) pixel."* 

We assume that we have an array of measured count values 
Cij across the array, each of which is a random, Poisson vari- 
able, with (cij) - dj. We make a weighted sum of the Cij as 
follows: 

N N 

i=-N j=-N 

In practice, the centre of the PSF for any real source can of course 
be located anywhere within the 'central' pixel. A more rigorous pro- 
cedure would take this into account. However, because it is not clear 
how best to do this, the simpler assumption has been made for the 
present study. 



and likewise for B' . 

The two cases we want to compare are, firstly, the IXMM- 
like case in which the w,j = 1 and N - 2; secondly the case 
in which the Wij are optimized according to the procedure de- 
scribed in section l231 But before the second procedure can be 
used, a value of must be chosen. It is not hard to see that, in 
principle, the larger the the better To see this, suppose we 
compare two convolvers: one optimised on a {2N + l)-square 
array, the other on a (2{N- 1)+ I)-square array. The optimized 
values of the small convolver can be thought of as a subset of 
the possible values of the large convolver; one just sets the extra 
ring of pixels to zero. So, if the smaller convolver were a bet- 
ter source-finder than the large, the optimisation routine would 
have set the outer pixel values to zero automatically, giving rise 
to the same sensitivity of detection with the large as with the 
small. Thus a small optimized convolver can never be better 
than a large one; and the only limiting factor to becomes 
computational practicality. A value of = 4 for the optimized 
convolvers has been chosen as the largest value which can be 
processed in the 5-band case (see section l3'.2.2> in a reasonable 
time. 

The difference in size between the IXMM and the opti- 
mized convolvers makes it more difficult to compare their ef- 
ficiency. It seems a bit pointless to hobble the optimized con- 
volver by restricting its size to the IXMM 5x5- after all, the 
whole aim of the exercise is to achieve the maximum practical 
improvement in sensitivity. It might be argued though that the 
discrepant sizes are unfair to the IXMM 'box' convolver - if 
bigger convolvers are sometimes better, might not much of the 
gain in going from a 5 x 5 box-type to a 9 x 9 optimized simply 
be due to the increase in size? It turns out however that box- 
type convolvers larger than 5x5 perform uniformly worse than 
the 5 X 5, as is shown in figure 0] in fact, if we go in the other 
dkection, to a 3 x 3 box, we get a slightly improved perfor- 
mance at all but the lowest background fluxes. Hence one can 
conclude that the comparison between convolvers of different 
size is probably being rather kind to the IXMM algorithm than 
otherwise. 

Some examples of optimized 9x9 kernels Wjj are compared 
to the PSF S ij in figure [S] One would expect that w — > 1 as 
B — > (all the counts are equally valuable) and w — > 5 as 
Z? — > oo (Gaussian limit). This is consistent with the form of 
the high- and low -background kernels shown in the figure. 

The quantity which should be compared is the counts am- 
plitude ySdet (as given for example in equationFTOt at which the 
signal is just detectable - that is, at which the resulting null- 
hypothesis probability is just equal to some previously decided 
cutoff Pdet- Since the IXMM sources were detected at a prob- 
ability cutoff of exp(-8.0) (equivalent to about a 4.3-sigma de- 
tection), this is the value that was chosen for the present exer- 
cise. 

jSdet as a function of background B is plotted in figure |6l 
for both the IXMM and the matched-filter procedures. For the 
latter, since there is no exact formula for the null-probability 
distribution P„uii, the calculated sensitivity depends on the ap- 
proximation used to represent f null- Shown on the figure are 
results (at finely-spaced values of background) of using respec- 
tively the Fay and Feuer (equation |8j and the (equation |9j 
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Fig. 4. Sensitivity of different convolvers as the array size is 
varied. The array side length is 2N + 1 . The squares represent 
the IXMM 'box' convolver. The optimized convolver is rep- 
resented by both crosses and filled triangles: the crosses show 
the estimates obtained by using the modified Fay and Feuer ap- 
proximation (eauationllO. whereas the triangles show the esti- 
mates from the chi-squared formula (eauation ll 11 1. Those cases 
in which the sensitivity of the optimized convolver appears to 
be worse than the box-type, and where the two approximate 
formulae for /3 give widely differing results, are because the 
approximations have not here been corrected by Monte Carlos 
as in figure|6| 

approximations, as well as four points (diamonds) where the 
true PniiU has been estimated via Monte Carlos of 10^ iterations 
each. 

As described in the introduction, several of the x-ray source 
detection procedures to be found in the literature include a 
step in which the raw images are convolved with the telescope 
PSF The PSF is an optimum convolver in the Gaussian limit 
but may be expected to depart from the ideal at low values 
of background flux. To investigate this, the detection sensitiv- 
ity obtained by use of the PSF as convolver was calculated at 
four levels of background. The resulting sensitivities, corrected 
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Fig. 5. Contour plots of the XMM-Newton PSF compared to 
two corresponding optimized convolution kernels, a) The on- 
axis PSF for the XMM-Newton EPIC PN camera at 1.25 keV, 
rebinned to 4 arcsec square pixels, b) Optimized convolution 
kernel for background B = 10 counts/pixel, c) Same, but op- 
timized for background B - 10 counts/pixel. All aiTays were 
normalized before plotting. The main contour separation (solid 
contours) is 0.01; the lowest of these intervals has again been 
divided into five (dashed contours). 



via use of Monte Carlos to estimate Pnuih are plotted as black 
squares on fig|6l 

Several conclusions can be drawn from figure|6l Firstly, for 
this form of signal, the Fay and Feuer approximation appears to 
be unserviceable for all but the highest values of background. 
The formula performs much better, accurately representing 
the true data down to about 0.2 counts/pixel background, be- 
low which it begins to diverge. Clearly though the matched- 
filter approach yields a better sensitivity than the 5x5 box con- 
volver at all values of background, apparently asymptoting to 
about 25% better at high background, the advantage decreasing 
to zero at low. The PSF appears to be a useful approximation 
to the optimum convolver for background levels greater than 
about 0.03 counts s"' . 

The PSF of the XMM-Newton EPIC cameras becomes az- 
imuthally distorted with distance from the centre of the field 
of view. It is therefore of interest to repeat the above exercise 
for an example of the off-axis PSF. In the present case, a PSF 
at 850 arcsec from the edge of the optical axis (94% of the ra- 
dius of the field-of-view of the EPIC cameras), at an azimuth 
of 45°, was arbitrarily chosen. All other variables were retained 
unchanged. No Monte Carlos were performed in this case, and 
only the results of the;^'^ formula were used. The results can be 
seen in figures0and|8] 

Comparison between figures|6land|8lshows that the degree 
of improvement to be gained through the use of matched fil- 
ters is approximately constant across the field of view. Also, 
whatever the method used, the sensitivity decreases by about 
15% towards the edge of the field of view. This is because the 
less compact the PSF, the more difficult it is to sequester source 
from background counts - the effective background counts in- 
volved with the source are higher. For similar reasons, a de- 
crease in sensitivity is seen in the detection of extended (ie 
non-pointlike) sources. Use of on-axis weights for the off-axis 
signal degrades the sensitivity by about 10% over the whole 
range of background. For values of background lower than 
about 0. 1 counts/pixel the sensitivity using this un-matched fil- 
ter becomes nominally worse than that achievable via the box- 
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Fig. 6. The minimum detectable counts amplitude /3det, plotted 
as a function of background flux B. The solid line represents the 
results obtained by convolving with a 5 x 5 unit array, as used 
in the creation of the IXMM catalog. The remainder of the plot 
refers to a weighted-sum scheme in which the weights are opti- 
mized to the on-axis PSF of the XMM-Newton EPIC FN cam- 
era at 1 .25 ke V, binned up on a 9 by 9 grid of 4 arcsec square 
pixels. The dash-dot line shows the sensitivity predicted by the 
modified Fay and Feuer approximation formula (J3det given by 
equation I10>. The dashed line gives the prediction of the 
formula (JS^^i given by eauationfTTl. The diamonds are samples 
of the 'true' sensitivity of the matched filter method as derived 
from Monte Carlo experiments. The black squares, also Monte 
Carlo corrected, are the result of convolving with the PSF in- 
stead of the optimized convolver. 




Fig. 7. Similar to figure|5lexcept that the PSF at 850 arcsec from 
the edge of the optical axis and azimuth = 45° was chosen. 



convolver method, although no doubt some ground could be 
recovered by correction of the formula via Monte Carlos. 

The variation in the shape of the PSF in XMM-Newton 
EPIC images puts some practical difficulties in the way of 
source detection via matched filters, because one has, in ef- 
fect, to employ a diff'erent convolving kernel for each pixel of 
the image. This was the method adopted by Ueda et al (19991. 
An approximation to this which still allows one to use the use- 
ful mathematical properties of convolution is to divide the im- 
age into patches, convolve each patch separately, then add the 
results. Vikhhnin et al (il995j used this technique. The XMM- 
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Fig. 8. Same as figure|6lexcept that the PSF at 850 arcsec from 
the edge of the optical axis and azimuth - 45° was chosen. 
The solid line again shows the results of the IXMM method. 
Both the dashed and dot-dash lines give weighted-sum results, 
but the dashed line weights were optimized for the correct PSF 
for this field position whereas the weights for the dot-dash line 
were derived by optimizing for the on-axis (ie here incorrect) 
PSF 

Newton SAS (Gabriel et al l2004> task asmooth (Stewart EUH^t 
can also perform such a piece-wise convolution. 

3.2.2. Matched filters for 5 energy bands 

In this case it is no longer possible to invert the IXMM and 
matched-filter methods in the same way, since the summed- 
likelihood approach used to find IXMM sources in 5 bands 
(described in section 13. 1> cannot be expressed as a weighted 
sum of Poissonian integers. However, equations |5] |5] and [O] 
taken together amount to a relationship between the counts am- 
plitude yS and the null-probability PnuW'- hence one can numeri- 
cally invert this relationship to obtain the sensitivity ySdet which 
corresponds to the cutoff probability f det, which is left at e^^ 
as before. 

In order to calculate a matched filter for summing images 
in several energy bands, one must know the relative strength of 
the signal in each band, which amounts to knowing the source 
spectrum. Where sources have a variety of spectral shapes, as is 
the case for cosmic sources of x-rays, the matched-filter tech- 
nique can only be optimized for a single class of sources at 
a time. The performance of the matched filter against an un- 
matched spectrum is examined toward the end of the present 
section. 

For purposes of the present study, weights were optimized 
to detect x-ray sources with an absorbed power-law spectrum 
having a photon index 1 .7 and a HI column density of 3 .0 x 1 0^" 
cm"^. The relative count rates in each band were obtained, via 
the program xspec, by folding this spectrum with the on-axis 
effective area function of the XMM-Newton PN camera. These 
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Table 1. Source and background weights for the XMM-Newton 
PN camera. 



Band enerj 


;ies (keV) 


Background 


Source 


Hard 


Low 


High 


weights: 


weights: 


source: 


0.2 


0.5 


0.0842 


0.1208 


0.0023 


0.5 


2.0 


0.1733 


0.3763 


0.0193 


2.0 


4.5 


0.0941 


0.0943 


0.2993 


4.5 


7.5 


0.0941 


0.0350 


0.6172 


7.5 


12.0 


0.1485 


0.0094 


0.0619 



weights are given in column 4 of table [2 The energy band 
definitions (columns 1 and 2) are those of IXMM. The spec- 
trum of the background (column 3), which is just as impor- 
tant for present purposes as that of the sources, was obtained 
from the IXMM catalog in the following way: for each energy 
band, a 2-dimensional histogram was made of the background 
counts for each source versus the exposure time; the approxi- 
mate minimum background count rate for the band was then es- 
timated from this plot. To enable a check of the efficiency of the 
matched filter at detecting a source of spectrum far from nom- 
inal, weights derived from the EPIC PN count rates of IXMM 
J175921. 7-335322, one of the hardest sources in the IXMM 
catalog, were also obtained: these are given in column 5 of the 
table. 

As noted in section 13.2.11 a value of = 4 for the size 
of the optimized convolvers was chosen as giving the largest 
convolver which was still practical to compute. The optimiza- 
tion with 5 bands is observed to be a little slower than for 
the single-band case, understandable because there are are now 
5 X {2N -H 1)^ = 405 weights which must be optimized in par- 
allel. 

The two approaches are compared in figure|9l Before com- 
menting on the difference in sensitivity between the two meth- 
ods we ought to make sure we have an accurate measure of 
those sensitivities. As shown via a Monte Carlo experiment in 
section im equation^] appears to be a poor approximation at 
a nett background level of 0. 1 counts/pixel. The Monte Carlo 
exercise was repeated for six more background levels logarith- 
mically spaced between 0.01 and 10 counts/pixel: that is, for 
each value of background, a Monte Carlo ensemble was gen- 
erated, a cumulative distribution of the ensemble values L was 
calculated, and finally a function Q{v/2,L) was fitted to this 
distribution curve. The resulting values of v are tabulated in ta- 
ble |2l The 'canonical' value of v in this case is 10 (= 2 x 5 
bands). 

Sensitivity values obtained by replacing the 5 in equation 
I13lbv the appropriate value of v/2 are shown by crosses in fig- 
ure |9] Clearly, use of the unmodified formula exacts a signifi- 
cant sensitivity penalty (because of over-estimation of the rate 
of false positives) at values of background less than about 1 .0 
counts/pixel. 

As was seen in section lT.2.11 the formula used to approx- 
imate the null-probability distribution of weighted-Poisson- 
sum values also tends to diverge from the true distribution at 
low values of background. A similar set of Monte Carlo cor- 
rections was therefore performed for the weighted-sum data. 
Fortuitously, for the 5-band signal presently chosen, the cor- 



Table 2. Values of v obtained by fitting Q(v/2, L) to Monte 
Carlo distributions of L at different background levels. 



Nett background 
(counts/pixel) 


Fitted 

V 


0.01 


4.31 


0.032 


4.21 


0.1 


5.81 


0.32 


7.26 


1.0 


8.22 


3.2 


8.90 


10.0 


9.35 
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Fig. 9. This fij 


;ure compares the theoretical sensitivity of two 



methods of detecting sources given several independent images 
of the same piece of sky. The methods compared here were de- 
signed to look for sources in XMM-Newton EPIC images made 
in 5 separate energy bands. The vertical scale shows the mini- 
mum detectable counts amplitude ySdet. whereas the horizontal 
scale gives the background counts per image pixel, summed 
over all energy bands. The solid line shows the nominal sensi- 
tivity of the source detection scheme used in the creation of the 
IXMM catalog; the dashed line shows the nominal sensitivity 
of the matched-filter method (x^ approximation). The crosses 
show values of IXMM sensitivity corrected via a Monte Carlo 
and fitting approach. Diamonds show similarly corrected sen- 
sitivity values of the matched-filter method. 



rections appear to be insignificant. The reader is cautioned not 
to expect this to be the case for all signal shapes. 

As mentioned in section one may compensate for the 
distortion in the true null likelihood (thus in the rate of false 
positives due to background fluctuations) by changing the value 
of nominal null likelihood used to sort 'sources' from 'non- 
sources' . Such compensating values of null likelihood for seven 
values of background, evenly-spaced on a logarithmic scale, 
are plotted in figureFTOI 

Taking the corrections into account, at high background 
values the matched-filter method appears to offer about 1.6 
times the sensitivity of the IXMM procedure, the advantage de- 
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Fig. 10. Nominal values of detection likelihood which should 
be chosen in order to obtain a true detection likelihood of 8. 
The crosses show the results relevant to the IXMM procedure; 
diamonds refer to the matched-filter method (x^ formula). 



Background flux (counts/pixel) 

Fig. 11. Same as figure|5] except that the sensitivity values were 
calculated using the hard-source weights of column 5 of table 
n Only the Monte-Carlo-coiTected points for the IXMM pro- 
cedure (crosses) and the ;\f^-theory curve for the matched-filter 
procedure (dashed line) are shown. 



creasing gradually to about 1 .2 at the lowest background value 
plotted. 

A hasty comparison of figures |6l and O may lead one to 
conclude that there is, paradoxically, not much advantage to 
be gained by detecting sources over 5 bands rather than 1. 
However, the vertical scales of the two graphs are not equiv- 
alent, because they refer to different signal shapes. To make a 
proper comparison one would need to multiply all the back- 
ground values of figure|9lby 0.1733 (the proportion of the total 
background found in energy band 2) and, for like reasons, all 
the amplitude values by 0.3763. Comparison with figure|6lthen 
reveals an improvement in sensitivity for 5- versus 1-band de- 
tection by about a factor of 2 for the matched-filter method and 
1.5 for the IXMM method. 

As mentioned earlier, a filter which has been optimized to 
detect sources of a particular spectrum may not perform well 
in the detection of sources with very different spectra. To in- 
vestigate this, the exercise of figure |5] was repeated. The filter 
was optimized as before for a source having weights listed in 
column 4 of tabled but the sensitivity values were calculated 
under the assumption that the source weights came from col- 
umn 5. The results are plotted in figurefTTI 

It is apparent that the sensitivity of the matched filter to 
the hard-spectrum source is significantly degraded - indeed by 
a factor of 2 - whereas the detection efliciency of the IXMM 
procedure is if anything slightly improved. Although this hard 
spectrum is the worst case likely to be encountered in practice, 
the results suggest the desirability of using a non-matched pro- 
cedure in parallel. It is however also worth noting in passing 
that, regardless of whether the detection technique is matched 
to a particular source spectrum or not, some spectrum must be 
assumed in order to calculate any kind of multi-band sensitivity 
value. 



3.2.3. A check on the sensitivity results 

So far we have been 'working backwards', inverting expres- 
sions to obtain estimates of the minimum signal detectable un- 
der a variety of conditions. As a check on this procedure, a 
'forwards' Monte Carlo experiment was performed as follows. 
Random data were generated from parent distributions of the 
form of equation[2for a sequence of values of the amplitude a. 
The data were generated in 5 bands, using the PN background 
ratios tabulated in table [l] with a net background flux of 1.0 
counts/pixel. The usual PSFs provided the signal appropriate to 
each band. An ensemble of 10"* mini images was accumulated 
at each of 200 equally-spaced values of alpha. Each 5-band im- 
age stack was submitted to both the IXMM and the matched- 
filter source detection procedures. The detection frequencies as 
functions of a are compared in figure From figure |9] one 
would expect that signals of a > about 16 would be detected 
by the matched-filter approach, as opposed to a cutoff of about 
25 for the (uncoiTected) IXMM approach. Although statistical 
fluctuations mean that signals with a < ySdet for that background 
flux are occasionally detected, and signals with a > y6det are oc- 
casionally not, the results of this Monte Carlo are consistent 
with the earlier analysis. 

4. Conclusions 

It is not possible to detect with certainty a signal superimposed 
on a noisy background where there is some non-zero probabil- 
ity that a combination of random background values can mimic 
the signal: the best that can be done is to calculate a detection 
probability - or its complement, the null or non-detection prob- 
ability. The aim of any method to enhance signal detectability 
is therefore to decrease the null probability of a signal of any 
given amplitude, or, equivalently, to decrease the amplitude at 
which a signal generates a given value of null probability. This 
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Signal amplitude a (counts) 

Fig. 12. Detection sensitivities of the IXMM (solid line) and 
matched-filter (dashed line) methods are compared via a Monte 
Carlo experiment. The vertical scale gives the fraction of 
'sources' at each value of a which were 'detected'. 

must be achieved by performing some transform upon the set 
of measured values of signal plus background plus noise. 

The main thrust of the present paper has been to examine 
that subset of transforms which may be expressed as discrete 
convolutions of the input data. There appears to be nothing 
new to say about the case in which the probability distribu- 
tion of the noise value in any given data channel is Gaussian; 
because of this, attention has been further restricted to the 
Poissonian regime. Two independent approximations to the 
null-probability distribution for a convolution of Poissonian 
data have been described and compared; such approximate for- 
mulas allow one to optimize the convolution for detection of 
signals of any specified shape without performing a Monte 
Carlo on each occasion. 

Although the method described here yields optimum sig- 
nal detection via convolution, the theory says nothing about 
the possibilities or otherwise of obtaining better detection 
sensitivity via other, non-linear transforms. Despite this, the 
optimized-convolution method has been shown to perform sub- 
stantially better over a wide range of background levels than the 
more complicated technique which was employed to detect, in 
XMM-Newton x-ray images, the sources which comprise the 
IXMM catalog. However this is perhaps not surprising in view 
of the fact that the convolution method makes use of more in- 
formation about the signal (in the IXMM case 'information' 
means the average spectrum and point-spread function (PSF) 
of the x-ray sources) than the IXMM method. 

The greater use of information about the signal exposes one 
of the drawbacks of the optimized-convolver method: namely, 
that if the shape of the signal is not known, or can vary, then 
assumptions must be made about it. Any given convolver is 
optimized for only one shape of signal (and for one partic- 
ular level of background). To detect with optimal sensitivity 
signals having a variety of shapes, under a variety of back- 
ground conditions, one would need to repeat the convolution 
with many different convolvers, each tailored to a different sig- 



nal/background combination. In practice this may not be worth 
the extra effort, since in many cases it may happen that accept- 
able results can be obtained by using a relatively small set of 
convolvers. Taking the XMM-Newton x-ray images as an ex- 
ample, it was shown in section 13.2.11 that use of the on-axis 
PSF across the whole field results in only a few percent loss of 
sensitivity even towards the edges of the field of view where 
the PSF becomes significantly elongated in the azimuthal di- 
rection. In addition, the form of the optimal convolver appears 
to be relatively insensitive to background level. As regards the 
x-ray spectrum, it should be remembered that the weights tab- 
ulated in table [2 represent x-ray spectra after convolution by 
the strongly peaked response function of the instrument, and 
their trend is therefore dominated by that response function. 
Sources with quite different spectra may thus be expected to 
yield similar sets of spectral weights; for example, the bulk 
of x-ray sources in IXMM exhibit their highest and lowest 
fluxes in bands 2 and 5 respectively. However it is probably 
desirable in practice to supplement the full matched-filter pro- 
cedure with a non-spectrum-specific detection algorithm. The 
IXMM sliding-box method, perhaps with the 'boxes' replaced 
by PSF-matched convolvers for increased sensitivity, is a pos- 
sible choice for the latter. 

Finally, some discrimination between signals may be de- 
sirable, since not all signal shapes represent sources we would 
wish to detect. In the x-ray case, whereas the IXMM detec- 
tion method cannot discriminate between an x-ray source and a 
bright pixel on the CCD^, the matched-filter method does offer 
some degree of selection against bright pixels and other arti- 
facts. 

Another caveat to be mentioned is the fact that both the 
formulas presented for null probability of a weighted sum of 
Poissonian data are only approximate; and what is perhaps 
worse, no analysis has yet been presented which would allow 
one to estimate the goodness of the approximations. In this 
case one can only fall back on probability curves derived from 
Monte Carlo data with which to calibrate the approximation 
formulae. A good example of the desirability of checks of this 
kind is the large gap demonstrated between the null-probability 
approximation used for IXMM source detection and the results 
of Monte Carlos at a variety of levels of background. Although 
the approximations for the matched-filter method do not appear 
to be nearly so unsatisfactory, they are not immune from diffi- 
culties of this sort and should be subject to similar checks in 
practice. 

A natural extension of the convolver method as applied 
to x-ray source detection is to allow one to coiTectly add to- 
gether overlapping images. One's first impulse in this situa- 
tion is simply to add the images without weighting, but (cer- 
tainly in the XMM-Newton case) because the background 
rate can vary with time, separate images may have different 
source/background ratios and should therefore be weighted ac- 
cordingly. The theory described in this paper allows one to es- 
timate the optimum weights for such combinations. 

' a further characterisation step in the IXMM chain, employing the 
SAS task emldetect, does however (in principle) allow the two to be 
discriminated. 
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The author hopes to make the matched-filter method avail- 
able in the edetect package of the v-6.5 release of the XMM 
SAS. 

Appendix A: The x-ray background 

As mentioned in the introduction, there is good evidence that 
the bulk of the cosmological x-ray background consists of nu- 
merous faint discrete sources. The present paper assumes how- 
ever that the x-ray sky may be modelled by sparse, relatively 
bright sources (though not all necessarily detectable at any 
given exposure duration) upon a relatively smooth background. 
It is therefore of interest to see how consistent this model is 
with our current understanding of the real sky. 

The fundamental quantity to deal with is the probabil- 
ity distribution of detected flux, where the ensemble is taken 
to consist of measurements at random directions in the sky; 
non-source background is neglected and it is assumed that 
each measurement is free from other sorts of variation, eg the 
Poissonian detection noise we have been discussing so far. The 
connection between the distribution of source flux and the dis- 
tribution of detected source counts depends in a mathemati- 
cally complicated way on the 'beam shape', or PSF in our case. 
Useful treatments of the topic can be found in Condon ( 11974 ,) 
and Scheuer ( 119741 . 

For present purposes it is enough to consider the standard 
deviation of such a probability distribution. Let us consider the 
case in which the distribution of source flux n(S ) obeys a power 
law, viz: 

n{S) = kS'^. 

Condon derives the following formula for standard deviation cr 
in this case:^ 



(A.l) 



where Cmax is some upper limit on the detected source counts 
and the equivalent solid angle of the PSF, is defined as 



(A.2) 



Note that cr is unbounded for the entire probability distribution 
(ie, as Cmax ^ 

As for n(S ), the measurements described in Mushotzky et 
al ((2000') are consistent with a 'two-slope' model in which, for 
energies from 0.5 to 2 keV (corresponding to energy band 2 of 
IXMM), y has flie value 1.7 below 5 = 7 x 10"'^ erg cm"^ s"' 
and 2.5 above it. The k values can be evaluated from 



ffaint 



'^bright 



^i-rbngh, ^ 200 deg" 



1 - Tfaint 1 - Tbright 

where S is the flux at the 'knee'. 

* Actually Condon's equation (13) appears to be incorrect: accord- 
ing to my working, the dI^^ should be inside the square root. In addi- 
tion, the condition y > 2, although necessary for others of his results, 
is not required here. 



Which y to choose? 200 sources per square degree, the in- 
tegrated number density at the knee, yields about 30 sources 
per EPIC PN field. This is quite a typical number of serendip- 
itous sources to find in such fields: hence one can say that the 
PN camera will, in a typical exposure, detect sources fainter 
than the flux at the knee. I therefore choose a value of 1 .7 for y. 
Substitution of this value into equation lA.2l vields 2.34 x 10"^ 
deg^ for the equivalent area of the on-axis, band 2 EPIC PN 
PSF 

At this point it is convenient to convert fluxes S to counts 
C. The highest value of background considered in the present 
paper is 10 counts pixel"'. By use of the same histogram tech- 
nique as described in section 13.2.21 one may deduce that the 
typical background count rate for EPIC PN in IXMM band 
2 is about 2 x 10"^ counts pixel"' s"'. A background value 
of 10 counts pixel"' thus corresponds to an exposure time of 
5 X 10^ s, which is 5 times longer than the longest (PN) ex- 
posure time in the IXMM catalog. Some multi-epoch obser- 
vations made with XMM-Newton may approach this duration 
however; hence I take it as a reasonable upper limit to practical 
XMM-Newton observations. The flux to count-rate conversion 
factor for IXMM band 2 (calculated in the same exercise as the 
source weights tabulated in tabled is 7.50 x 10" counts erg"' 
cm^; the 'knee' in the Mushotzky soft-band logA^-log5 diagram 
thus falls, for a 5 x 10^ s exposure time, at ~2600 counts in 
IXMM band 2 and (making use of the source weights in table 
Q 7000 counts in the total band. Comparison of these num- 
bers with figure|5]shows that even the IXMM algorithm could 
detect sources 2 orders of magnitude fainter than the 'knee' 
at this exposure duration; clearly the estimate in the previous 
paragraph that XMM-Newton is capable of seeing far past the 
'knee' in the logA^-log5 curve is correct. 

One finds that A:faint in these units works out to be 6.88 x 10"*. 
We are now in a position to use equation lA.il to calculate the 
'standard deviation' cr of the noise in the ensemble of measure- 
ments. Eauation lA. 1 [ evaluates to 

cr^l.llxC:;;^^ 

It only remains to select a value for Cmax- Suppose we choose 
Cmax = 47 counts, which from figure |9] is the detection sen- 
sitivity obtainable (under the assumption that the background 
is flat) at this exposure length using the matched-filter algo- 
rithm described in the present paper, cr evaluates to ~ 1 3 counts, 
which is several times larger than the standard deviation VTO 
of the Poisson noise. However, the logA^-log5 model indicates 
that the total source density at this counts value is 6640 deg"^, 
which is still only 0.16 sources per PSF equivalent area Qg. 
The counts value at which one expects 1 source in total per Qe 
is 3.3; using this for Cmax yields 2.4 counts for cr instead, just 
lower than the Poisson noise. Thus we may conclude that, for 
XMM-Newton exposures of total duration up to 5 x 10^ s in 
length, the x-ray sky may still be modelled to acceptable accu- 
racy by a flat background with superposed sources. It is clear 
though that the next generation of x-ray telescopes may not 
have life so easy. 

Finally, a word about background estimation. So far in the 
present paper it has been assumed that the background con- 
tribution is known. Background can be difficult to estimate. 
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though - to calculate the background one must first excise or 
avoid the sources, but it is difficult to find sources without first 
having a knowledge of the background. However, one can con- 
ceive of an iterative process in which the detectable sources are 
gradually detected and excised in parallel with improvements 
in the background estimate. In the present case that would still 
leave a population of sources which are too faint to be detected 
but which are brighter than the confusion level calculated in the 
preceding paragraph. The sum of these faint sources will bias 
the background estimate upward. The average counts deg"^ 
contributed by these sources is 

dC C n{C). 

For a single-power-law n - kC^'^ this gives 

If we use 6.88 x 10"^ for k, 1.7 for j, 47 for Qet and 3.3 for 
Cconf, ip evaluates to 4 x 10^ counts deg"^, or 0.5 counts pixel"' 
for 4 arcsec square pixels. A similar calculation can of course 
be performed for any other exposure duration. 
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